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ABSTRACT 

We describe a Bayesian framework for estimating the time-domain noise covariance of CMB obser- 
vations, typically parametrized in terms of a 1// frequency profile. This framework is based on the 
Gibbs sampling algorithm, which allows for exact marginalization over nuisance parameters through 
conditional probability distributions. In this paper we implement support for gaps in the data streams 
and marginalization over fixed time-domain templates, and also outline how to marginalize over con- 
fusion from CMB fluctuations, which may be important for high signal-to-noise experiments. As a 
by-product of the method, we obtain proper constrained realizations, which themselves can be useful 
for map making. To validate the algorithm, we demonstrate that the reconstructed noise parame- 
ters and corresponding uncertainties are unbiased using simulated data. The CPU time required to 
process a single data stream of 100000 samples with 1000 samples removed by gaps is 3 seconds if 
only the maximum posterior parameters are required, and 21 seconds if one also want to obtain the 
corresponding uncertainties by Gibbs sampling. 

Subject headings: cosmic microwave background — cosmology: observations — methods: statistical 
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1. INTRODUCTION 

Detailed observations of the cosmic microwave back- 
ground (CMB) during the last two decades have revolu- 
tionized cosmology. Through detailed measurements of 
the angular CMB power spectrum, a highly successful 
cosmological concordance model has been established, 
stating that the universe is statistically isotropic and 
homogeneous, filled with Gaussian random fluctuations 
drawn from a ACDM spectrum, and consists of 4% bary- 
onic matter, 23% dark matter and 73% dark energy (e.g., 
Komatsu et al. 2011, and references therein). Using this 
model, millions of data points from many different types 
of cosmological observations can be fitted with only six 
free parameters. 

This success has been driven primarily by rapid 
progress in CMB detector technology, allowing experi- 
mentalists to make more and more detailed maps of the 
CMB fluctuations. However, such maps are imperfect, in 
the sense that they typically are contaminated by various 
instrumental effects. For instance, the optics of a given 
experiment can be asymmetric; the detector gain may 
be unknown and time-dependent; the data may exhibit 
resonant frequencies due to electronics or cooling non- 
idealities; and the observations are invariably noisy. All 
these non-idealities must be properly understood before 
one can attempt to extract cosmology from the observa- 
tions. 

In this paper we consider one specific component 
within this global calibration problem, namely how to 
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estimate the statistical properties of the instrumental 
noise in light of real- world complications. This prob- 
lem has of course already been addressed repeatedly in 
the literature (e.g., Prunet et al. 2001; Hinshaw et al. 
2003), and our method is in principle similar to that 
of Ferreira & Jaffe (2000), taking a Bayesian approach 
to the problem. The main difference is that we formu- 
late the algorithm explicitly in terms of a Gibbs sampler 
including both the time stream and the noise parame- 
ters as unknown variables, and this has several distinct 
advantages. First, it allows us to obtain proper uncer- 
tainties on all derived quantities. Second, gap filling is 
directly supported through built-in proper constrained 
realizations. This can for instance be used to account 
for instrumental glitches in the time stream, or to ex- 
clude point sources and other bright sources from the 
analysis. Third, it is straightforward to add support for 
additional nuisance parameters, due to the conditional 
nature of the Gibbs sampler. In this paper we imple- 
ment template marginalization, which may for instance 
be useful for removing cosmic ray glitches in the Planck 
HFI data (Planck 2011b) or ground pickup for ground 
based experiments (e.g., QUIET 2011). We also out- 
line the formalism for marginalization over CMB fluctu- 
ations, which may be relevant for experiments with high 
signal-to-noise ratio. 

The method presented here is mathematically identi- 
cal to the CMB Gibbs sampling framework developed 
by Jewell et al. (2004); Wandch et al. (2004); Eriksen 
et al. (2004, 2008), and the main difference is simply 
that the object under consideration is a one-dimensional 
time stream instead of a two-dimensional field on the 
sphere. This makes the implementation considerably 
simpler, and the run times correspondingly faster. As 
a demonstration of the practicality of the method, we 
apply it to simulated data with properties typical for 
current ground-based experiments, and demonstrate ex- 
plicitly that the computational costs of the method are 
tractable. The experiment of choice will be QUIET 
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(2011), for which this method was initiahy developed. 

2. DATA MODEL 

The first step of any Bayesian analysis is to write down 

an explicit parametric model for the observations in ques- 
tion. In this paper, we start with the assumption that 
the output, d, from a given detector can be written in 
terms of the following sum, 

d = n + Ps + Ta + m. (1) 

Here each term indicates a vector of n values sampled 
regularly in time in steps of Af, that is, d = {di} with 
i = l,...,N. 

The first term on the right-hand side, n, indicates the 

instrumental noise, which is our primary target in this 
paper. All the other components are only nuisance vari- 
ables that we want to marginalize over. 

We assume that the noise is Gaussian distributed and 
stationary over the full time range considered. In prac- 
tice this means that the full data set of a given ex- 
periment should be segmented into parts which are in- 
dividually piecewise stationary. For QUIET this cor- 
responds to division into so-called "constant elevation 
scans" (QUIET 2011), while for Planck it corresponds 
to division into so-called "rings" , which are one-hour ob- 
servation periods with a fixed satellite spin axis (Planck 
2011a). Because the noise is assumed stationary, the 
time-domain noise covariance matrix, N, depends only 
on the time lag between two observations, Ntf = N(t — 
t'): It is a Toeplitz matrix, and may therefore be well ap- 
proximated in Fourier domain with a simple diagonal ma- 
trix, TVyy/ = N^Svvi . Here is the Fourier-domain noise 
power spectrum, which is given by the Fourier transform 
oiN{t-t'). 

Our main task is to estimate N,^^ and we do so in terms 
of a parametrized function. For many experiments this 
function is well approximated by a so-called 1// profile, 







1 + 






\ /knee / 



which describes a sum of a correlated and an uncorre- 
lated noise component in terms of three free parameters. 
The white- noise RMS level, ao, defines the overall am- 
plitude of the noise; the knee frequency, /knee) indicates 
where the correlated and the uncorrelated components 
are equally strong, and a is the spectral index of the cor- 
related component. Collectively, we denote {(Jq^ a, /knee} 
by 9. Of course, other parametrizations may easily be 
implemented if necessary. 

The second term on the right-hand side, Ps, indicates 
the contribution from the CMB sky, with P being a 
pointing matrix, typically eqiial to zero everywhere ex- 
cept at Pip if the detector points towards pixel p at time 
z, and Sp is the true (beam convolved) CMB signal. We 
make the usual assumption that s is isotropic and Gaus- 
sian distributed with a given angular power spectrum, 
Q. In this paper, we will simply outline the formalism 
for how to deal with this term, and leave the implemen- 
tation for a future paper dedicated to Planck analysis; 
as mentioned in the introduction, this machinery was 
initially developed QUIET, which is strongly noise dom- 
inated for a single data segment, and the CMB com- 
ponent is therefore not important, as will be explicitly 



demonstrated in this paper. 

The third term is a sum over ntcmp time-domain tem- 
plates. These can be used to model several different types 
of nuisance components. Three examples are diffuse fore- 
grounds and cosmic ray glitches for Planck, and ground 
pick-up for QUIET. In either case, we assume in this pa- 
per that the template itself is perfectly known, and the 
only free parameter is an overall unknown multiplicative 
amplitude a. This is a vector of length ntemp, and T is 
the two-dimensional nxritemp matrix listing all templates 
column- wise. 

Finally, the fourth term on the right-hand side of Equa- 
tion 1 denotes a time-domain mask, m. This is imple- 
mented by a "Gaussian" component having zero variance 
for samples that are not masked, and infinite variance 
for samples that are masked. In order to make analytic 
calculations more transparent, we write the the corre- 
sponding covariance matrix as a diagonal matrix with 
elements 

_ i a, i not masked ,„s 
" ~ 1 e, i masked ' ^ 

where a ^ oo and e ^ 0. A typical application of this 
component is to remove periods of instrumental glitch- 
ing, or to discard particularly bright observations when 
the telescope points towards bright astrophysical sources, 
such as point sources or the Galactic plane. 

3. GIBBS SAMPLING AND THE POSTERIOR 

Our primary goal is now to map out P{9\d), the noise 
spectrum posterior distribution marginalized over all nui- 
sance components. By Bayes' theorem this distribution 

p(«id) = ^<mo<Aw«), (4) 

where C{9) — P{d\9) is the likelihood, P{9) is a prior on 
9, and P{d) is an irrelevant normalization constant. 

In this paper we adopt for simplicity uniform priors on 
ao, 01 and /knee- For typical relevant time series which 
contain ^ 10^ samples, these parameters axe usually 
strongly data-driven, and the choice of priors is there- 
fore irrelevant. However, if an informative prior (or the 
Jeffreys' prior) is desired for a given application, it is 
straightforward to include this as indicated by Equa- 
tion 4. 

Since we assume that the noise is Gaussian distributed 
with covariance N(^), the likelihood is given by 

p-in^N-i(e)n 
L{9) (X , (5) 

where n = d — Ps — Fa — m is the noise component of 
the data stream. The goal is to compute this distribution, 
marginalized over s and a, while at the same time taking 
into account possible gaps in the data. 

The latter point touches on an important computa- 
tional issue. If there are no gaps in the data, then N is a 
Toeplitz matrix, and multiplication with N is performed 
most efficiently in Fourier space. However, the same docs 
not hold if there are gaps in d, since the symmetry of N 
is broken. The well-known solution to this problem is to 
fill the gap with a constrained noise realization with the 
appropriate spectrum (e.g., Hoffman & Ribak 1991). In 
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our formulation, this is equivalent to estimating n jointly 
with 9. 

More generally, we want to estimate the joint density 
P(n, m, s, a, 0|d), from which any desired marginal may 
be obtained. At first sight, this appears like a formidable 
computational problem, involving more than 10"" free pa- 
rameters. However, this is also a problem that may be 
tackled by means of the statistical technique called Gibbs 
sampling, which has already been described in detail for 
computing the CMB angular power spectrum with con- 
taminated data by Jewell et al. (2004); Wandelt et al. 
(2004); Eriksen et al. (2004, 2008). 

According to the theory of Gibbs sampling, samples 
from a joint distribution may be obtained by iteratively 
sampling from each corresponding conditional distribu- 
tion. For our case, this leads to the following sampling 
scheme, 



m, n P(m, n|s, a, 9, d) 
s, n ^ P(s, n|a, 9, m, d) 
a, n P(a, n\9, m, s, d) 
9 <- P(6'|n,m,s,a, d) 



(6) 
(7) 
(8) 
(9) 



The symbol indicates sampling from the distribu- 
tion on the right-hand side. With this algorithm, 
(n, m, s, a, 6') ' will be drawn from the correct joint dis- 
tribution. 

Note that each of the sampling steps that involve time- 
domain vectors are joint steps including the noise com- 
ponent itself. This approach is highly computationally 
advantageous as it allows for fast multiplication with N 
in Fourier domain; conditional algorithms for sampling 
each component separately would require slow convolu- 
tions in time domain. Of course, it is fully acceptable 
within the Gibbs sampling machinery to sample some 
components more often that others. 

Note also that if we are only interested in the joint 
maximum-posterior parameters, we can replace the rele- 
vant steps in the above algorithm by a maximization op- 
eration, such that we maximize the conditional instead 
of sampling from it. The algorithm then reduces to a 
typical iterative approach, but formulated in a conve- 
nient and unified statistical language. The advantage of 
this approach is computational speed, while the disad- 
vantage is the loss of information about uncertainties. 
Both versions of the algorithm will be implemented and 
demonstrated in the following. 

4. SAMPLING ALGORITHMS 

Equations 6-9 defines the high-level algorithm in terms 
of conditional sampling steps. To complete the method, 
we have to establish efficient sampling algorithms for 
each conditional distribution. 

4.1. Noise estimation with ideal data 

Perhaps the most fundamental conditional distri- 
bution in the sampling scheme outlined above is 
P{9\n, m, s, a, d). This describes the distribution of the 
noise parameters given perfect knowledge about all com- 
ponents of the data. To obtain an explicit expression for 
this distribution, we first note that P(0|n, m, s, a, d) = 
P(0|n); if we know the true noise component, n, no fur- 
ther information about either the CMB signal, the tem- 




10 20 30 40 50 

Chain distance 

Fig. 1. — Correlation function for a of the Metropolis sampler 
employed to sample from P{8\n) with a curvature matrix based 
proposal density. Similar plots for ctq and /kncG look visually the 
same. The correlations fall below 10% at a lag of ~20 samples, and 
wc adopt a thinning factor of ~20 samples to suppress correlations, 
given that the computational cost of this sampling step is lower 
than the constrained realization sampler. 

plate amplitudes, or, indeed, the actual data is needed 
in order to estimate the noise parameters. 

The expression for the conditional distribution P{9\n) 
is then formally the same as that for P(6'|d) given by 
Equations 4 and 5. Writing this out in Fourier space 
for the 1/f profile discussed in Section 2, one finds the 
following explicit distribution for 9 ~ {(jQ.a, /knee}, 



lnP(cro,a, /k„ee|n) = - lnP((To, a, /k 
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(10) 



Here are the power spectrum components of the data 
n, while N^, = iV;y(cro, Q^, /knee) is the covariance matrix 
which in Fourier space is diagonal and given by Equa- 
tion 2. The first term on the right-hand side is a user- 
defined prior. 

To sample from this distribution, we use a stan- 
dard Metropolis sampler with a Gaussian proposal den- 
sity (e.g., Liu 2001). Each chain is initialized at the 
maximum-posterior point, which is found by a non-linear 
quasi-Newton search, and the covariance matrix of the 
Gaussian proposal density is taken to be the square root 
of the curvature matrix, evaluated at the maximum- 
posterior point. The elements of the inverse curvature 
matrix, C'^ = - log P{9\n)/de^d9j, read 

.a . — 
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(11) 

We have found that this proposal density leads to a 
Markov chain correlation length of about 20 samples 
for typical parameter values, and we therefore thin our 
chains by this amount. In addition, we remove a few 
post-thinned samples at the beginning of the chain to re- 
move potential burn-in, although we have never seen evi- 
dence of any such effects. Finally, if we only want to find 
the maximum-posterior point, and not run a full-blown 
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Fig. 2. — Constrained realizations through a gap in the data. The 
extent of the gap is indicated by the two vertical dashed lines; the 
solid black curve shows the input data, which is dominated by a 
point source (thin spikes) in the masked out region. Five different 
constrained realizations are shown by colored lines. These depend 
only on the data outside the mask, and are therefore not affected 
by the point source. 



Gibbs chain, as discussed in Section 3, we terminate the 
process directly after the initial quasi-Newton search. 

4.2. Gap filling by constrained realizations 

Next, we need to establish a sampling algorithm for 
P(n, m|s, a, 9, d) — P(n, m|r, 6), where the residual r = 
d — Ps — Ta = n + m. Note that r and n differ only 
by m, which represents contaminated segments of data 
that are masked out. The problem is therefore reduced 
to sampling the components of a sum of Gaussians given 
the sum itself. This may be achieved efficiently by solving 
the equation 

[N"^ + M"^]n = M"^r + N"5v + M"5w (12) 

for n, where v and w are vectors of standard iV(0, 1) 
Gaussian random variates (e.g., Jewell et al. 2004; Wan- 
delt et al. 2004; Eriksen et al. 2004). 

The matrices involved here are typically of the order 
10^ X 10^, so the equation cannot be solved by brute 
force. But since M is diagonal in time domain and N is 
diagonal in frequency domain (due to its Toepliz nature), 
all multiplications are cheap in either time or Fourier do- 
main, and the equation can be efficiently solved with the 
Conjugate Gradients method, properly changing basis as 
needed. 

However, there is one practical complication involved 
in Equation 12: Since we intend to let e — >■ and a — )■ oo, 
the matrix + M"^ becomes infinitely poorly condi- 
tioned. We can solve this problem, as well as significantly 
simplifying the equation, by splitting it into one equation 
for the masked region, and one for the unmasked region. 
Introducing the notation xi and X2 for the unmasked 
and masked subsets of a vector x respectively, we find 

(N-in)i + e^^ni = er^ri + (N"5v)i + er^wi (13) 
(N~^n)2 + a^^na = a^^ra + (N"3v)2 + a^^W2 (14) 



which in the limit e — !■ 0, a — !■ cx) simplifies to 

ni=ri (15) 

(N-in)2 = (N-H)2 (16) 

Note that equations 15-16 form an asymmetric equation 
system^. The Conjugate Gradients method is therefore 
not directly applicable, and the more general Biconjugate 
Gradients method must be used instead. 

We found a simple diagonal preconditioner with value 
Var(n) inside the mask and value 1 outside it to be suf- 
ficient. 

4.3. Marginalization over fixed templates 

The sampling algorithms described in Sections 4.1 and 
4.2 defines the core noise Gibbs sampler, and together 
form a well-defined and complete noise estimation al- 
gorithm for low signal-to-noise data with gaps. How- 
ever, one of the main advantages of the Gibbs sampling 
algorithm compared to other alternatives is its natural 
support for marginalization over nuisance parameters. 
In this section we describe the sampling algorithm for 
marginalization over fixed time-domain templates, de- 
scribing for instance ground pickup, diffuse foregrounds 
or cosmic ray glitches. 

First, we note that while the original data stream, d, 
may contain gaps, and the Toeplitz nature of the noise 
covariance matrix is in that case broken, the constrained 
realization produced in Section 4.2 restores the Toeplitz 
symmetry. It is therefore computationally advantageous 
to use q = d — Ps — m = n -f Ta as the data in this 
step^, so that P(a, n|6', m, s, d) = P(a, n|q, 9). 

Starting with Equation 5, solving for a and complet- 
ing the square in the exponential, the appropriate con- 
ditional distribution for a is found to be the well-known 
distribution 

P(a|q, 9) OC e-5(a-a)'(T'N-T)(a-a)^ (^7) 

where a = (T*N-iT)-iT*N-iq; that is, P(a|q,6') is 
a Gaussian distribution with mean a and covariance 
Ca = (T*N'iT)-^ The same result has been derived 
for numerous other applications, one of which was de- 
scribed by Eriksen et al. (2004), outlining template am- 
plitude sampling with CMB sky map data. 

Sampling from this distribution is once again straight- 
forward. If one only wish to marginalize over a small 
number of templates, the easiest solution is simply to 
compute both a and Ca by brute-force, and let a'+^ — 
a-t- LaJy, where La is the Cholesky factor of Ca — LaL^, 
and 77 is a vector of uncorrelated standard Gaussian 
iV(0, 1) variates. On the other hand, if there are more 
than, say, 1000 templates involved, it may be faster to 

^ Equations 15-16 can be rewritten as 



■ 1 ■ 






.^21' ^22'. 







This is asymmetric because we multiplied the unmasked part of 
the equation by e in order to get a finite result. 

^ Of course, one could write down a sampling algorithm for a 
that only uses the non-masked data directly, but this would require 
heavy time-domain convolutions, and not take advantage of the 
symmetries inherent in the noise covariance matrix. 




Fig. 3. — Recovered noise parameters from 10 000 noise-only simulations with gaps. The dashed horizontal lines indicate the true input 

value. 



solve the following equation by Conjugate Gradients, 



(18) 



where w is a full time-stream of N{0, 1) random variates. 
In this paper, which has the QUIET experiment as its 
main application, we only use the former algorithm; for 
Planck the latter algorithm may be useful in order to 
account for frequent and partially overlapping cosmic ray 
glitches efficiently. 

4.4. CMB signal marginalization 

Most CMB experiments are strongly noise-dominated 
within relatively short time periods, and need to inte- 
grate over the sky for a long time in order to produce 
high-sensitivity maps. For instance, for the first-sciason 
QUIET experiment the mean polarized sensitivity of a 
detector was 280/iKv^ (QUIET 2011), while the polar- 
ized CMB sky has an RMS of ^ 1/xK on the relevant 
angular scales. In such cases, it is an excellent approx- 
imation simply to ignore the CMB contribution when 
estimating the noise parameters. However, this approx- 
imation does not hold for all experiments, and one par- 
ticularly important counterexample is Planck. 

In order to marginalize over the CMB signal we need to 
be able to sample from P(s, n|a, 0, m, d) = P(s,n|u, 0), 
where the residual isu = d — m — Fa = Ps + n. This is 
again a case that involves sampling terms of a Gaussian 
sum given the sum itself. The only difference from Sec- 
tion 4.2 is that a projection operator is involved in this 
case, 

(S-^ + P'^N-^P)s = P^N-^u + S-5v + P^N-5w 

(19) 

Here v and w are vectors of standard normal samples 
in pixel and time domain respectively. Note that M is 
not involved this time, so the expression can be used 
as it is. As before, multiplications involving N are ef- 
ficient in Fourier space due to the Toeplitz nature of 
N, but the CMB signal covariance, S, will in general 
be dense, depending on the true CMB power spectrum. 
This makes this method prohibitively expensive for gen- 
eral asymmetric scanning strategies. There are, however, 
circumstances for which also this multiplication becomes 
efficient. One obvious case is that of full sky coverage, 



where a change to spherical harmonic basis makes S di- 
agonal. 

More interestingly, S also becomes diagonal when ex- 
pressed in Fourier basis on a circle on the sky. To 
see this, consider a circle with radius 9 on the sphere, 
parametrized by the angle (f). The covariance between 
the points pi = (6, (pi) and p2 — {0,4>2) with angular 
distance r is given by the two-point correlation function: 



1 °° 

S{pi,Pi) = C{r) = — Y^{21 + l)CiPi{cos{r)) 



(20) 



Since r{pi,p2) for the case of a circle only depends on 
A(/), and is independent of <f) itself, 5(^1,(^2) is a Toepliz 
matrix, and is therefore diagonal in Fourier space. This 
is highly relevant for Planck, since the Planck scanning 
strategy (Planck 2011a) naturally divides into scans of 
circles. Therefore, in this case sampling s ^ F(s|u) can 
be done at very low extra cost. 

5. APPLICATION TO SIMULATED DATA 

In this section we demonstrate the noise Gibbs sampler 
as described in Section 4 on a particular type of QUIET 
simulations. QUIET is a radiometer-based CMB B-mode 
polarization experiment located in the Atacama desert 
(QUIET 2011), which took observations from August 
2008 to December 2010. The first results were based on 
only nine months worth of data, and yet already provided 
the second most stringent upper limit on the tensor-to- 
scalar ratio, r, to date based on CMB polarization mea- 
surements. 

In its normal mode of operation, QUIET observed four 
separate CMB fields on the sky which were each chosen 

because of their low foreground levels. In this mode, the 
experiment is totally noise dominated on time scales of 
an hour or less, with a mean polarized sensitivity per 
detector diode of 2m^iK^ (QUIET 2011). 

However, QUIET also observed two Galactic patches, 
one of which was the Galactic center, as well as several 
bright calibration objects, such as the Moon, Tau Alpha 
and RCW38. These sources are bright enough to be seen 
visually in each detector time stream, and they can there- 
fore bias any noise estimates unless properly accounted 
for. Such objects also complicate automated data selec- 
tion processes, since it is difficult to distinguish between 
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TABLE 1 
Validation by simulations 



Simulation ctq (10 ^V) a 


/knee (IQ-lRz) 




Posterior maximization (absolute parameter values) 


Gaps only 1.000 ± 0.003 -1.80 ±0.06 
Gaps + uncorrected CMB 1.000 ± 0.003 -1.80 ± 0.06 
Gaps + uncorrected ground pickup 1.006 ± 0.003 -1.52 ± 0.05 
Gaps + corrected ground pickup 1.000 ± 0.003 -1.80 ± 0.06 


1.00 ±0.05 
1.00 ± 0.05 
1.51 ±0.08 
1.00 ± 0.05 


Gibbs sampling (normalized parameter values) 


Gaps only -0.02 ±1.00 0.01 ± 1.01 
Gaps + uncorrected CMB -0.02 ± 1.00 0.00 ± 1.01 
Gaps + uncorrected ground pickup 1.78 ± 1.03 6.04 ± 1.36 
Gaps + corrected ground pickup -0.02 ± 1.00 0.00 ± 1.00 


0.03 ± 1.01 
0.02 ± 1.01 
6.63 ± 0.77 
0.03 ± 1.00 



Note. — Summary of recovered noise parameters from various simulated ensembles. 
Each column indicates the mean and standard deviation of the resulting parameter 
distribution. The top section shows results obtained when simply maximizing the 
posterior, while the bottom section shows the results for a full Gibbs sampling analysis. 
Each run in the top section consists of 10000 simulations, while each run in the bottom 
section consists of 5000 simulations. All runs have been started with the same seed, to 
ensure directly comparable results. 

an astrophysical object and an instrumental glitch. 

In this section wc show how the algorithm developed 
in Section 4 may be applied to such situations. Specifi- 
cally, we consider a observing session lasting for about 40 
minutes of a field including a bright source with known 
location, and assume that the data may be modeled as 
d = n + Ta + m. Here T is a single template describing 
possible sidelobe pick-up from the ground, constructed 
from the full observing season as described by QUIET 
(2011), and m is a time-domain mask that removes any 
samples that happen to fall closer than 1° from the source 
center. The total number of samplers in the data stream 
is 60 949, and the total miniber of masked samples is 651. 

The simulations used in this section are constructed 
as follows. We set up an ensemble of 10^ time streams 
containing correlated Gaussian random noise with ctq = 
10~^V, a = —1.8 and /k = O.lHz; the white noise and 
spectral index are representative for a QUIET detector, 
while the knee frequency is grossly exaggerated to push 
the algorithm into a difficult region of parameter space, 
as well as to more clearly visualize the outputs of the 
algorithm. A far more reasonable value for QUIET is 
/k = lOmHz, and we have of course verified that the 
algorithm also works for such cases. Further, it reaches 
convergence faster in that case than with the extreme 
value of /k used in the present simulations. 

5.1. Visual inspection of constrained realizations 

Before considering the statistical properties of the re- 
sulting posterior distributions, it is useful to look visu- 
ally at a few constrained realizations in order to build 
up intuition about the algorithm. In order to highlight 
the behavior of the constrained realizations, we make 
two adjustments to the above simulation procedure for 
this case alone: First, we replace the tuned mask with a 
wide 6000-sample mask, covering the entire time range in 
which the source is visible, and second, we make the cor- 
related noise component stronger by setting ctq — lO^^V, 
/knee = IHz and a = -2.3. 

The results are shown in Figure 2. The raw data are 
shown in the solid black line, and the vertical dashed 



lines indicate the extent of the gap. The colored curves 
within the gap shows 5 difference constrained realiza- 
tions; note that together with the black solid curve out- 
side the mask, any of these form a valid noise realization 
with the appropriate noise power spectrum as defined by 
(TO) ct and /knee- They are each a valid sample drawn 
from P(n, m|d). However, if one had tried to estimate 
the noise spectrum also using the data inside the gap, the 
source signal (seen as sharp spikes in Figure 2) would bias 
the resulting noise parameters. 

In this paper, we consider the constrained realizations 
primarily to be a useful tool that allows for fast noise 
covariance matrix multiplications in Fourier space. How- 
ever, these constrained realizations can of course also be 
useful in their own right, for instance for deglitching a 
time stream before map making. 

5.2. Validation and statistical characterization 

We now seek to statistically validate our algorithms 
and codes. Both the posterior maximization and the 
Gibbs sampling algorithms are considered. The number 
of simulations are 10000 for posterior maximization and 
5000 for Gibbs sampling, with properties as described 
above. In each case, we consider four different models. 
First, we analyze simulations including only noise and 
gaps. Second, we add a CMB signal to each realiza- 
tion, but do not attempt to correct for it. Third, we 
add a strong ground template to each realization, and 
do also not attempt to correct for it. Fourth, we analyze 
the same ground-contaminated simulations as above, but 
this time do marginalize over an appropriate template. 
The same random seeds were used in each of the four sim- 
ulation and analysis results, in order to allow for direct 
comparison of results between runs. 

The results from this exercise are summarized in Ta- 
ble 1, and histograms for the first case are shown in 
Figure 3. First, the the upper section in Table 1 lists 
the mean and standard deviation of the recovered pa- 
rameters for the posterior maximization algorithm. Re- 
call that the input parameters were (ctq, a, /knee) = 
(10~^V, —1.8,0. IHz), and these are recovered perfectly 



7 




1 2 3 

Number of samples (10^) 



20 40 60 80 
Number of gaps 



J ^ \ ^ L 

3 

4 

Length of gap (10 ) 



Fig. 4. — Left: CPU time as a function of total number of samples in the time stream, keeping the number of masked samples constant. 
Middle: CPU time as a function of number of masked samples. The size of each gap is kept constant at 100 samples, and only the number 
of gaps vary. Right: CPU time as a function of number of masked samples, but in this case there is only one gap of varying width. 



in all cases, except for the one involving an uncorrected 
ground template, as expected. 

Second, in the bottom half we show the results from 
the Gibbs sampling analyses, but this time in terms of 
normalized parameters on the form r = {0est — ^in)/o'est) 
where ^est and CTgst are the mean and standard deviation 
of the Gibbs chain for a given parameter (removing the 
first 10% of the samples for burn-in), and Cin is the true 
input value. If the Gibbs chain is both unbiased and has 
the correct dispersion, r should be Gaussian distributed 
with zero mean and unit variance. As seen in Table 1, 
this is indeed the case. 

We also note that adding a CMB component to these 
simulations do not bias the noise estimates, simply be- 
cause the CMB is too weak to be detected on the time 
scales considered here. This confirms the assumption 
made by the QUIET team when estimating their noise 
properties: The QUIET observations are sufficiently 
noise dominated on a one-hour time scale that the CMB 
can be safely neglected for noise estimation purposes. 

5.3. Resource requirements 

To be practical, it is not sufficient that a method is ro- 
bust and accurate, but it must also be computationally 
efficient. For the present algorithm the two most im- 
portant parameters for computational speed are 1) the 
total number of samples in the time stream, n, and 2) 
the number of masked samples, m, while also the relative 
position of the masked samples play an important role. 

In Figure 4 we show the scaling of each of the two algo- 
rithms (posterior maximization and Gibbs sampling) as 
a function of both n (left panel) and m (right panel). In 
the left plot, m was fixed at 1000, divided into ten gaps 
of 100 samples each, and only the total length of the 



data stream was varied. In this case, we should expect 
the scaling of the overall algorithm to be dominated by 
Fourier transforms, suggesting an overall behavior given 
by O(n\ogn). As seen in Figure 4, this approximation 
holds to a very high degree, both for posterior maximiza- 
tion and Gibbs sampling. Further, we see that the CPU 
time required to analyze a single 100 000 sample data 
set with 1000 samples removed is 3 seconds for posterior 
maximization and 21 seconds for Gibbs sampling. 

In the middle panel, we fix n at 100 000, and increase 
m by varying the number of gaps, each extending 100 
samples. Perhaps somewhat surprisingly, we see that the 
computing time in this case is nearly independent of m. 
The reason for this is simply that the number of conju- 
gate gradient itcirations required for the gap filling proce- 
dure is largely determined by condition number (ie., the 
ratio between the highest and smallest eigenvalue) of the 
covariance matrix of a single gap. Having more gaps sep- 
arated by more than one time-domain correlation length 
effectively corresponds to performing multiple matrix in- 
versions in parallel, and the net cost therefore do not 
increase significantly. 

In the third panel, we increase m by making one gap 
larger, as opposed to adding many small gaps. In this 
case the CPU time does increase dramatically, because 
it becomes increasingly hard for the algorithm to fill the 
missing pieces of the data stream. In this case, the noise 
covariance matrix condition becomes larger and larger. 

6. SUMMARY 

We have described and implemented a Bayesian frame- 
work for estimating the time-domain noise power spec- 
trum for non-ideal CMB experiments. This framework is 
conceptually identical to a previously described method 
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for estimating the angular CMB power spectrum from 
CMB sky maps (Jewell et al. 2004; Wanclelt et al. 2004; 
Eriksen et al. 2004), and relies heavily on the Gibbs sam- 
pling algorithm. The single most important advantage 
of this method over existing competitors in the literature 
derives from the conditional nature of the Gibbs sampler: 
Additional parameters may be introduced conditionally 
into the algorithm. This allows for seamless marginal- 
ization over nuisance parameters, which otherwise may 
be difficult to integrate. A second important advantage 
of the method is the fact that it provides proper un- 
certainties on all estimated quantities, which at least in 
principle later may be propagated into final estimates of 
the uncertainties of the CMB sky map and angular power 
spectra. 

In this paper we implemented support for two gen- 
eral features that are useful for analysis of realistic data, 
namely constrained realizations and template sampling. 
The former is usehil whenever there are gaps in the data, 
for instance due to an instrumental glitch, or there are 
strong localized sources in the sky that may bias the 
noise estimate: In these cases, the gaps are refilled with 
a constrained noise realization with the appropriate noise 
parameters, such that the full time stream represents a 
proper sample from a Gaussian distribution with a noise 
covariance matrix, N. Since the time stream no longer 
contains gaps, the Toeplitz symmetry of the noise covari- 
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ance matrix is restored, and matrix miiltiplications may 
be performed quickly in Fourier space. 

The second operation, template sampling, is also a 
powerful and versatile technique for mitigating system- 
atic errors. In this paper we mostly focused on data 
from the QUIET experiment, for which ground pickup 
from sidelobes is one significant source of systematics 
(QUIET 2011). In a future publication we will apply the 
same method to simulations of the Planck experiment, 
for which cosmic ray glitches is an important source of 
systematic errors. As detailed by Planck (2011b), these 
cosmic rays may be modeled in terms of a limited set 
of time domain templates, and the algorithms presented 
in this paper should therefore prove useful for mitigating 
the effects of these glitches, as well as for propagating the 
corresponding uncertainties into the final noise spectrum 
parameters. 
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